Republic of Tuva
Empowering Children to Create AI-Enabled Augmented Reality Experiences
Zhang, Lei, Zhou, Shuyao, Liaqat, Amna, Mak, Tinney, Berengard, Brian, Qian, Emily, Monroy-Hernández, Andrés
Despite their potential to enhance children's learning experiences, AI-enabled AR technologies are predominantly used in ways that position children as consumers rather than creators. We introduce Capybara, an AR-based and AI-powered visual programming environment that empowers children to create, customize, and program 3D characters overlaid onto the physical world. Capybara enables children to create virtual characters and accessories using text-to-3D generative AI models, and to animate these characters through auto-rigging and body tracking. In addition, our system employs vision-based AI models to recognize physical objects, allowing children to program interactive behaviors between virtual characters and their physical surroundings. We demonstrate the expressiveness of Capybara through a set of novel AR experiences. We conducted user studies with 20 children in the United States and Argentina. Our findings suggest that Capybara can empower children to harness AI in authoring personalized and engaging AR experiences that seamlessly bridge the virtual and physical worlds.
- Asia > South Korea > Busan > Busan (0.05)
- North America > United States > New Jersey > Mercer County > Princeton (0.04)
- South America > Argentina > Pampas > Buenos Aires F.D. > Buenos Aires (0.04)
- (3 more...)
- Research Report > New Finding (1.00)
- Questionnaire & Opinion Survey (1.00)
- Education > Curriculum > Subject-Specific Education (0.93)
- Information Technology (0.92)
- Education > Educational Setting > K-12 Education > Primary School (0.46)
Reality Proxy: Fluid Interactions with Real-World Objects in MR via Abstract Representations
Liu, Xiaoan, Jia, Difan, Liu, Xianhao Carton, Gonzalez-Franco, Mar, Zhu-Tian, Chen
Interacting with real-world objects in Mixed Reality (MR) often proves difficult when they are crowded, distant, or partially occluded, hindering straightforward selection and manipulation. We observe that these difficulties stem from performing interaction directly on physical objects, where input is tightly coupled to their physical constraints. Our key insight is to decouple interaction from these constraints by introducing proxies: abstract representations of real-world objects. We embody this concept in Reality Proxy, a system that seamlessly shifts interaction targets from physical objects to their proxies during selection. Beyond facilitating basic selection, Reality Proxy uses AI to enrich proxies with semantic attributes and hierarchical spatial relationships of their corresponding physical objects, enabling novel and previously cumbersome interactions in MR (such as skimming, attribute-based filtering, navigating nested groups, and complex multi-object selections), all without requiring new gestures or menu systems. We demonstrate Reality Proxy's versatility across diverse scenarios, including office information retrieval, large-scale spatial navigation, and multi-drone control. An expert evaluation points to the system's utility and usability, suggesting that proxy-based abstractions offer a powerful and generalizable interaction paradigm for future MR systems.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.28)
- North America > United States > New York > New York County > New York City (0.14)
- North America > Canada > Newfoundland and Labrador > Newfoundland > St. John's (0.14)
- (16 more...)
- Research Report (1.00)
- Questionnaire & Opinion Survey (1.00)
- Health & Medicine (0.94)
- Leisure & Entertainment > Games (0.46)
Exploring the Innovation Opportunities for Pre-trained Models
Park, Minjung, Forlizzi, Jodi, Zimmerman, John
Innovators transform the world by understanding where services are successfully meeting customers' needs and then using this knowledge to identify failsafe opportunities for innovation. Pre-trained models have changed the AI innovation landscape, making it faster and easier to create new AI products and services. Understanding where pre-trained models are successful is critical for supporting AI innovation. Unfortunately, the hype cycle surrounding pre-trained models makes it hard to know where AI can really be successful. To address this, we investigated pre-trained model applications developed by HCI researchers as a proxy for commercially successful applications. The research applications demonstrate technical capabilities, address real user needs, and avoid ethical challenges. Using an artifact analysis approach, we categorized capabilities, opportunity domains, data types, and emerging interaction design patterns, uncovering some of the opportunity space for innovation with pre-trained models.
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
- Asia > Russia > Siberian Federal District > Republic of Tuva > Kyzyl (0.04)
- Asia > China (0.04)
- Education (1.00)
- Banking & Finance (1.00)
- Information Technology (0.93)
- (3 more...)
Augmenting Human Cognition through Everyday AR
As spatial computing and multimodal LLMs mature, AR is poised to become an intuitive "thinking tool," embedding semantic and context-aware intelligence directly into everyday environments. This paper explores how always-on AR can seamlessly bridge digital cognition and physical affordances, enabling proactive, context-sensitive interactions that enhance human task performance and understanding.
- North America > United States > New York > New York County > New York City (0.07)
- Asia > Russia > Siberian Federal District > Republic of Tuva > Kyzyl (0.05)
Peek into the 'White-Box': A Field Study on Bystander Engagement with Urban Robot Uncertainty
Yu, Xinyan, Hoggenmueller, Marius, Tran, Tram Thi Minh, Wang, Yiyuan, Zhang, Qiuming, Tomitsch, Martin
Uncertainty inherently exists in the autonomous decision-making process of robots. Involving humans in resolving this uncertainty not only helps robots mitigate it but is also crucial for improving human-robot interactions. However, in public urban spaces filled with unpredictability, robots often face heightened uncertainty without direct human collaborators. This study investigates how robots can engage bystanders for assistance in public spaces when encountering uncertainty and examines how these interactions impact bystanders' perceptions and attitudes towards robots. We designed and tested a speculative 'peephole' concept that engages bystanders in resolving urban robot uncertainty. Our design is guided by considerations of non-intrusiveness and eliciting initiative in an implicit manner, considering bystanders' unique role as non-obligated participants in relation to urban robots. Drawing from field study findings, we highlight the potential of involving bystanders to mitigate urban robots' technological imperfections to both address operational challenges and foster public acceptance of urban robots. Furthermore, we offer design implications to encourage bystanders' involvement in mitigating the imperfections.
- Europe > Austria > Vienna (0.14)
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- (32 more...)
- Transportation (0.68)
- Health & Medicine (0.47)
- Leisure & Entertainment (0.46)
Co-Designing Augmented Reality Tools for High-Stakes Clinical Teamwork
Taylor, Angelique, Tanjim, Tauhid, Cao, Huajie, Nicoly, Jalynn Blu, Segal, Jonathan I., George, Jonathan St., Kim, Soyon, Ching, Kevin, Ortega, Francisco R., Lee, Hee Rin
How might healthcare workers (HCWs) leverage augmented reality head-mounted displays (AR-HMDs) to enhance teamwork? Although AR-HMDs have shown immense promise in supporting teamwork in healthcare settings, design for Emergency Department (ER) teams has received little attention. The ER presents unique challenges, including procedural recall, medical errors, and communication gaps. To address this gap, we engaged in a participatory design study with healthcare workers to gain a deep understanding of the potential for AR-HMDs to facilitate teamwork during ER procedures. Our results reveal that AR-HMDs can be used as an information-sharing and information-retrieval system to bridge knowledge gaps, and they surface concerns about integrating AR-HMDs in ER workflows. We contribute design recommendations for seven role-based AR-HMD application scenarios involving HCWs with various expertise, working across multiple medical tasks. We hope our research inspires designers to embark on the development of new AR-HMD applications for high-stakes team environments.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- North America > United States > Michigan > Ingham County > Lansing (0.04)
- (13 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Health & Medicine > Therapeutic Area (1.00)
- Health & Medicine > Health Care Technology > Telehealth (1.00)
- Health & Medicine > Health Care Providers & Services (1.00)
Toyteller: AI-powered Visual Storytelling Through Toy-Playing with Character Symbols
Chung, John Joon Young, Roemmele, Melissa, Kreminski, Max
We introduce Toyteller, an AI-powered storytelling system where users generate a mix of story text and visuals by directly manipulating character symbols as if they were playing with toys. Anthropomorphized symbol motions can convey rich and nuanced social interactions; Toyteller leverages these motions (1) to let users steer story text generation and (2) as a visual output format that accompanies story text. We enabled motion-steered text generation and text-steered motion generation by mapping motions and text onto a shared semantic space so that large language models and motion generation models can use it as a translational layer. Technical evaluations showed that Toyteller outperforms a competitive baseline, GPT-4o. Our user study identified that toy-playing helps express intentions that are difficult to verbalize. However, motions alone could not express all user intentions, suggesting that they should be combined with other modalities such as language. We discuss the design space of toy-playing interactions and implications for technical HCI research on human-AI interaction.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
- Asia > Japan > Honshū > Kantō > Kanagawa Prefecture > Yokohama (0.05)
- (23 more...)
- Questionnaire & Opinion Survey (1.00)
- Research Report > Experimental Study (0.92)
Ukraine captures North Korean soldiers; Russia readies for talks with Trump
Russia appeared to ready itself for talks on the future of Ukraine with United States President-elect Donald Trump ahead of his swearing-in on Monday. "No special conditions are needed for this. What is required is the mutual intent and political will to have a dialogue," said Russian President Vladimir Putin's spokesman Dmitry Peskov on Saturday. But Russia expressed its parameters very quickly. Putin aide Nikolai Patrushev told Russian news outlet KP that a Ukraine settlement should be reached by the US and Russia, without Ukraine and without the European Union.
- North America > United States (1.00)
- Asia > North Korea (0.57)
- Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.44)
- (22 more...)
RealitySummary: On-Demand Mixed Reality Document Enhancement using Large Language Models
Gunturu, Aditya, Jadon, Shivesh, Zhang, Nandi, Thundathil, Jarin, Willett, Wesley, Suzuki, Ryo
We introduce RealitySummary, a mixed reality reading assistant that can enhance any printed or digital document using on-demand text extraction, summarization, and augmentation. While augmented reading tools promise to enhance physical reading experiences with overlaid digital content, prior systems have typically required pre-processed documents, which limits their generalizability and real-world use cases. In this paper, we explore on-demand document augmentation by leveraging large language models. To understand generalizable techniques for diverse documents, we first conducted an exploratory design study which identified five categories of document enhancements (summarization, augmentation, navigation, comparison, and extraction). Based on this, we developed a proof-of-concept system that can automatically extract and summarize text using Google Cloud OCR and GPT-4, then embed information around documents using a Microsoft HoloLens 2 and Apple Vision Pro. We demonstrate real-time examples of six specific document augmentations: 1) summaries, 2) comparison tables, 3) timelines, 4) keyword lists, 5) summary highlighting, and 6) information cards. Results from a usability study (N=12) and in-the-wild study (N=11) highlight the potential benefits of on-demand MR document enhancement and opportunities for future research.
- North America > Canada > Alberta > Census Division No. 6 > Calgary Metropolitan Region > Calgary (0.29)
- Europe > Ukraine (0.14)
- North America > United States > New York > New York County > New York City (0.04)
- (2 more...)
- Questionnaire & Opinion Survey (1.00)
- Research Report (0.82)
- Overview (0.67)
- Instructional Material > Course Syllabus & Notes (0.48)
- Education (1.00)
- Information Technology > Services (0.48)
- Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.50)
Augmented Physics: A Machine Learning-Powered Tool for Creating Interactive Physics Simulations from Static Diagrams
Gunturu, Aditya, Wen, Yi, Thundathil, Jarin, Zhang, Nandi, Kazi, Rubaiat Habib, Suzuki, Ryo
We introduce Augmented Physics, a machine learning-powered tool designed for creating interactive physics simulations from static textbook diagrams. Leveraging computer vision techniques, such as Segment Anything and OpenCV, our web-based system enables users to semi-automatically extract diagrams from physics textbooks and then generate interactive simulations based on the extracted content. These interactive diagrams are seamlessly integrated into scanned textbook pages, facilitating interactive and personalized learning experiences across various physics concepts, including gravity, optics, circuits, and kinematics. Drawing on an elicitation study with seven physics instructors, we explore four key augmentation techniques: 1) augmented experiments, 2) animated diagrams, 3) bi-directional manipulatives, and 4) parameter visualization. We evaluate our system through technical evaluation, a usability study (N=12), and expert interviews (N=12). The study findings suggest that our system can facilitate more engaging and personalized learning experiences in physics education.
- North America > Canada > Alberta > Census Division No. 6 > Calgary Metropolitan Region > Calgary (0.15)
- North America > United States > New York > New York County > New York City (0.04)
- Asia > China > Hong Kong (0.04)
- (6 more...)
- Questionnaire & Opinion Survey (1.00)
- Instructional Material (1.00)
- Personal > Interview (0.48)
- Research Report > New Finding (0.34)
- Education > Educational Setting (1.00)
- Education > Curriculum > Subject-Specific Education (0.90)
- Education > Educational Technology > Educational Software > Computer Based Training (0.86)